Chronic myeloid leukemia (CML) progression is defined by increased disease burden and the potential for transformation to blastic phase (BP), yet the underlying mechanisms remain elusive and the course of this evolution is difficult to predict. We applied elements of state-transition theory to model the transcriptional dynamics underlying leukemic progression, where individual cells' gene expression profiles represent microstates, and aggregated population-level gene expression profiles define macrostates that govern phenotypic transitions from health to disease initiation to overt leukemia. Previously, we demonstrated through bulk transcriptomic analyses that CML progression and therapeutic responses can be accurately predicted by modeling a leukemic potential landscape derived from transcriptional state-space geometry.

To extend this framework to single-cell (sc) resolution and capture more granular information at the microstate level, we collected weekly peripheral blood samples from inducible chronic phase (CP) and BP CML mouse models to generate time-series scRNA-seq data tracking the transition from health to overt CP or BP. After performing quality control and cell labeling, we used the first (healthy) time point before the induction of BCR::ABL (T0) and the final leukemia time point (Tf), in addition to intermediate time points, to assess changes in cell populations and gene expression. All analyses were first performed in CP mice, validated in BP mice, and corroborated using two independent human CML scRNA-seq datasets.

By comparing sequential time points, we observed that progression to CP CML was associated with a significant decrease in B cells (p<0.01) and increases in both myeloid (p<0.01) and stem cell (p<0.001) populations. Differentially expressed gene (DEG) analysis revealed marked transcriptional changes across all major lineages (DEGs: B cells = 781; T cells = 2,149; Myeloid = 1,999; Stem cells = 194). Despite these changes, no single-cell transcriptional state at any time point was uniquely associated with health or disease, as the transcriptional profiles of leukemic cells largely overlapped with those of healthy cells. However, when scRNA-seq data were computationally aggregated into pseudobulk (PsB) samples, mimicking bulk RNA-seq, a distinct CML state-space emerged, revealing a clear disease trajectory defined by three stable macrostates representing early, transitional, and late leukemia. Cell type–specific PsB analyses uncovered independent state-transition dynamics in the B cell, T cell, myeloid, and stem cell compartments, highlighting previously unrecognized complexity in disease evolution involving multiple lineages. We further showed that the transition from health to leukemia at the PsB level could be reconstructed as a linear combination of transitions within individual cell subpopulations.
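The pseudobulk aggregation and the linear-combination reconstruction described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the count matrix is synthetic, the helper `pseudobulk` and all variable names are hypothetical, and raw count sums are used with no normalization or dimensionality reduction. Under those assumptions the decomposition is exact, because the whole-sample pseudobulk profile is simply the sum of the cell type–specific pseudobulk profiles; with normalized or projected data it would hold only as a weighted or approximate combination.

```python
import numpy as np

def pseudobulk(counts, labels):
    """Sum a cell-by-gene count matrix within each label.

    Returns (unique_labels, sums), where sums has one row per unique label.
    """
    labels = np.asarray(labels)
    uniq = np.unique(labels)
    sums = np.vstack([counts[labels == u].sum(axis=0) for u in uniq])
    return uniq, sums

# Synthetic toy data (assumption): 600 cells, 50 genes, two time points.
rng = np.random.default_rng(0)
n_cells, n_genes = 600, 50
idx = np.arange(n_cells)
counts = rng.poisson(2.0, size=(n_cells, n_genes))
timepoint = np.where(idx < n_cells // 2, "T0", "Tf")
cell_type = np.array(["B", "T", "Myeloid", "Stem"])[idx % 4]

# Whole-sample pseudobulk shift from T0 to Tf.
tp, psb = pseudobulk(counts, timepoint)
delta_total = psb[tp == "Tf"][0] - psb[tp == "T0"][0]

# Cell type-specific pseudobulk shifts.
delta_by_type = {}
for ct in np.unique(cell_type):
    mask = cell_type == ct
    tp_ct, psb_ct = pseudobulk(counts[mask], timepoint[mask])
    delta_by_type[ct] = psb_ct[tp_ct == "Tf"][0] - psb_ct[tp_ct == "T0"][0]

# With raw sums, the total shift equals the sum of the per-type shifts.
reconstructed = sum(delta_by_type.values())
print(np.array_equal(reconstructed, delta_total))
```

Summing raw counts keeps the identity transparent; any per-sample scaling applied before aggregation would turn the unit coefficients into sample-dependent weights.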

Building on this decomposition, we quantified each lineage's contribution to disease progression by performing computational simulations that subtracted the influence of each cell type in turn. Interestingly, B and myeloid populations contributed most, and comparably, to the global disease trajectory, even though B cells decreased and myeloid cells expanded over time. This counterintuitive finding suggests that leukemic information is encoded not simply by abundance but by dynamic transcriptional shifts across compartments. These findings were validated in a BP mouse model and in two human CML datasets.
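One way to picture the subtraction of each cell type's influence is a leave-one-cell-type-out calculation. The sketch below is an assumption about how such a simulation could be set up, not the method used in the study: the data are synthetic, the helper `psb_shift` is hypothetical, and the contribution score (the projection of the removed shift onto the full pseudobulk shift) is an illustrative choice. The toy signal is planted in myeloid cells only, so it does not reproduce the comparable B-cell contribution reported above.

```python
import numpy as np

def psb_shift(counts, timepoint, keep):
    """Pseudobulk Tf-minus-T0 expression shift using only the cells in `keep`."""
    late = counts[keep & (timepoint == "Tf")].sum(axis=0)
    early = counts[keep & (timepoint == "T0")].sum(axis=0)
    return late - early

# Synthetic toy data (assumption): 400 cells, 30 genes, four cell types.
rng = np.random.default_rng(1)
n_cells, n_genes = 400, 30
idx = np.arange(n_cells)
cell_type = np.array(["B", "T", "Myeloid", "Stem"])[idx % 4]
timepoint = np.where(idx < n_cells // 2, "T0", "Tf")
counts = rng.poisson(2.0, size=(n_cells, n_genes))

# Planted signal: myeloid cells at Tf gain expression in the first 10 genes.
shifted = (cell_type == "Myeloid") & (timepoint == "Tf")
counts[shifted, :10] += 5

# Full trajectory: shift computed from all cells.
full = psb_shift(counts, timepoint, np.ones(n_cells, dtype=bool))

contribution = {}
for ct in np.unique(cell_type):
    # Recompute the shift with this cell type removed.
    ablated = psb_shift(counts, timepoint, cell_type != ct)
    # Fraction of the full shift (along its own direction) lost by the removal.
    contribution[ct] = float((full - ablated) @ full / (full @ full))

for ct, value in sorted(contribution.items(), key=lambda kv: -kv[1]):
    print(f"{ct:8s} {value:+.3f}")
```

Because the per-type shifts add up to the full shift in this raw-count setting, the scores sum to one, which makes them readable as fractional contributions.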

In summary, we introduce a conceptual framework that distinguishes between microstates (individual cells) and macrostates (population-level transcriptional ensembles), showing that disease progression is not encoded at the single-cell level but instead emerges only when cells are aggregated. While this may appear intuitive, we were surprised to find that higher-resolution sc data did not provide clearer insight than pseudobulk data. We speculate that this may reflect greater Shannon information entropy at the single-cell level, which introduces variability that obscures coherent state transitions. To our knowledge, this framework is novel and may inform future approaches to interpreting scRNA-seq data in leukemia and other diseases.
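The entropy argument can be made concrete with a toy calculation. This is only a sketch of the general idea, under stated assumptions: a single synthetic gene with Poisson-distributed counts, a hypothetical helper `shannon_entropy` that bins values on a shared grid and computes H = -Σ p log2 p, and pseudobulk-like values formed by averaging disjoint groups of 100 cells. It is not the analysis performed in the study.

```python
import numpy as np

def shannon_entropy(values, bins):
    """Shannon entropy (in bits) of values histogrammed into fixed bins."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic toy data (assumption): one gene measured in 5000 cells.
rng = np.random.default_rng(2)
cells = rng.poisson(3.0, size=5000)

# Pseudobulk-like values: mean expression over disjoint groups of 100 cells.
groups = cells.reshape(50, 100).mean(axis=1)

# Shared unit-width bins so the two levels are compared on the same grid.
bins = np.arange(-0.5, 15.5, 1.0)
h_cells = shannon_entropy(cells, bins)
h_groups = shannon_entropy(groups, bins)

print(f"single-cell entropy: {h_cells:.2f} bits")
print(f"aggregated entropy:  {h_groups:.2f} bits")
```

Averaging narrows the distribution of values around the population mean, so the aggregated level occupies fewer bins and carries lower entropy on this grid, which is consistent with the intuition that aggregation suppresses cell-to-cell variability.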
